Introduction. The paper addresses the automated construction of a local area network diagram using tools and
methods for traffic analysis at the data link layer of the OSI model. The problem arises from two factors: the difficulty
of determining the connections between equipment manually, and the lack of physical access to the communication
lines of a network that is already in operation. The purpose of the work is to reduce the time spent on building a local
network diagram by automating the process of determining the connections between equipment.

Materials and Methods. To solve these tasks, a method for determining the relative location of devices is proposed.
A specialized software and hardware complex is inserted into a break in a communication line at different points of the
network, so that its two network adapters face opposite directions. The method is based on calculating the
intersections of the address sets received from these adapters. Structural schemes of such a software and hardware
complex and the requirements for it are given. Methods of obtaining MAC addresses from transit packets are
described, and examples of software libraries for performing this operation are given. A relational database structure is
proposed for storing the received data, and the format and content of the fields of its table are described.

Results. Using a typical Ethernet network as an example, the developed methods are shown to determine the relative
location of end devices specified by their MAC addresses, as well as of at least two switches located between them. The
signs that indicate the presence of switching equipment in a particular segment are identified. A method is proposed
that uses a set of relational operations to refine the network topology sequentially until the required accuracy is
achieved.

Discussion and Conclusions. The results obtained can be used in the administration of large local networks with an
extensive structure. The proposed approach reduces the time required to build a network diagram by automating the
collection of information about the devices operating on the network and their mutual location.
Introduction. Large local area networks are characterized by a complex configuration of physical connections,
which to a great extent determines the efficiency of their work [1, 2]. In practice, an organization does not always have
a detailed scheme or other documentation describing the network equipment and its interconnections. This significantly
complicates administration and makes it important to determine the structure of connections in an operating network
so that a diagram of the nodes and the links between them can be built.

Communication lines are most often hidden behind structural or decorative elements of the building, so that
only the switching equipment is accessible. In this case, it is impossible to see which network node each connected
cable leads to. The task therefore arises of constructing a network diagram by analyzing data obtained from
traffic captured at certain points, namely places where additional traffic-analyzing software and hardware can be
connected. The objective is to reduce the time spent on building a local network scheme.

The task is complicated by the fact that all information related to the functioning of the local
network belongs to the second (data link) layer of the OSI model, whereas a significant part of the important data in a
packet belongs to a higher layer, the network layer [3, 4].

Most methods of traffic analysis are designed for processing network-layer information [5]. There is therefore
a need to develop methods that obtain all the data necessary for building a network diagram from the
packet headers of the data link layer of the OSI model. On the other hand, network topologies at the data link layer are
simpler than at the network layer, and they are always strictly regulated by the relevant standards.

Materials and Methods. The Ethernet standard, which is widely used for building local computer networks, 
provides for the use of the “tree” topology for organizing connections between nodes [5]. In graph theory, a “tree” is 
defined as a connected circuit-free graph [6]. An important consequence of this definition is that there is one and only
one path between any pair of vertices in the tree [7]. This makes it possible to dispense with route searching within such a
network and greatly simplifies the operation of the equipment.

When constructing a graph, the set of its vertices and the connections between them are determined [8]. In
the network graph, the vertices are network devices. To address them within the local network, the MAC addresses
assigned by the manufacturer are used. These are unique for each device and are 6 bytes long [9]. The header of each
network packet contains two MAC addresses: the sender's and the recipient's. They do not change during the transmission
of a packet within the local network, and therefore, in the problem under consideration, they can be used to identify
network nodes.
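
As an illustration of how the two addresses can be read from a packet header, the following minimal sketch assumes an Ethernet II frame as delivered by a capture library (preamble already stripped), in which the recipient address occupies the first 6 bytes and the sender address the next 6. The function names and the sample frame, including the last three bytes of each address, are hypothetical.

```python
def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the dash-separated notation used in this paper."""
    return "-".join(f"{b:02X}" for b in raw)


def extract_macs(frame: bytes) -> tuple[str, str]:
    """Return the (recipient, sender) MAC addresses of an Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    recipient = format_mac(frame[0:6])   # bytes 0..5: recipient address
    sender = format_mac(frame[6:12])     # bytes 6..11: sender address
    return recipient, sender


# Hypothetical frame: recipient C8-24-6F-00-00-01, sender A4-57-F3-00-00-02,
# EtherType 0x0800, followed by an arbitrary payload.
frame = bytes.fromhex("c8246f000001" "a457f3000002" "0800") + b"payload"
print(extract_macs(frame))
```

Only the sender address is needed for the method described below, since it identifies the device located on the side from which the packet arrived.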

When building a network graph, the major difficulty is determining the connections. Each connection links two
vertices whose relative location, as noted above, is unknown because of the distance between them or hidden cable
routing. Connections can link devices of different types: switch to computer or switch to switch. The latter constitute the
data transmission infrastructure and are of the greatest interest for analyzing the network topology. In contrast,
the connections between switches and computers lead to the terminal vertices of the graph. Computers
connected to the same switch can be conditionally combined into a group, since their location relative to other
computers is the same. Larger sets of nodes can also be treated as a group, including computers connected to
two or more nearby switches (for example, those serving one floor of a building or several offices
of one department). In general, nodes that are part of a set should be located closer to each other than to nodes that are
not part of the set or are part of another set [10]. As a first approximation, the entire local network can be considered
such a set, because its nodes are closely interconnected and separated from other networks [11].

The research idea is to refine the network topology step by step. To do this, the set of MAC
addresses of the devices in the network is divided into progressively smaller subsets, down to the groups of computers
connected to individual switches.

The division into subsets is performed relative to the points at which a hardware device capable of analyzing
network packets and extracting address and other information from them is connected to the network. Such a device can
be a laptop or a single-board computer that can work with two network adapters simultaneously. This allows the device to
be inserted into a break in a connection, so that one part of the network is connected to each of the two network
adapters.

It is important to note the difference between the terms “vertex” and “point”. A vertex is a part of the network 
graph that denotes some equipment: a switch or an end device. A point is the connection point of the specified hardware 
complex, which is always located between two vertices. 

Since the analyzing device is inserted into a break in a connection, it must keep the
communication line in which this break occurs operational. To that end, the two network adapters must be joined
by the operating system with a “bridge” type connection, in which packets arriving at one of the interfaces are
transmitted to the other using the data link layer mechanisms of the OSI model, that is, without taking into account IP
addresses, routing, NAT, etc. This way of organizing the connection is completely transparent to other devices on
the network: it does not change packets and does not manifest itself in any other way.

The main task of the device under consideration is to extract MAC addresses from transit packets. At this 
(first) stage of building a local network graph, traffic capture utilities (to write it into a file and analyze) or specialized 
libraries of software components (to analyze traffic in real time) are used [12, 13]. Depending on the operating system, 
the libraries may differ; however, as a rule, they are all based on Pcap (Packet Capture). 
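
As a rough illustration of the file-based variant, the sketch below reads sender MAC addresses from a capture saved in the classic pcap file format, using only the Python standard library. It is not the implementation used in this work: a real deployment would rely on a Pcap-based library, the pcapng format is not handled, and the names `sender_macs` and `make_pcap` as well as the sample frames are hypothetical.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1


def sender_macs(stream):
    """Yield the sender MAC address of every Ethernet frame in a pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order used by the writer.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        order = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        order = ">"
    else:
        raise ValueError("not a classic pcap file")
    if struct.unpack(order + "I", header[20:24])[0] != LINKTYPE_ETHERNET:
        raise ValueError("capture is not Ethernet")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return                      # end of capture
        _sec, _usec, incl_len, _orig_len = struct.unpack(order + "IIII", record)
        frame = stream.read(incl_len)
        if len(frame) >= 12:
            yield "-".join(f"{b:02X}" for b in frame[6:12])


def make_pcap(frames):
    """Build a small in-memory pcap file, used here only for demonstration."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, LINKTYPE_ETHERNET)
    for f in frames:
        out += struct.pack("<IIII", 0, 0, len(f), len(f)) + f
    return io.BytesIO(out)


frames = [
    bytes.fromhex("ffffffffffff" "a457f3000002" "0806") + bytes(46),
    bytes.fromhex("a457f3000002" "c8246f000001" "0800") + bytes(46),
]
print(list(sender_macs(make_pcap(frames))))
```

In practice one capture would be taken per network adapter, so that each address can be attributed to the side from which it arrived.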

Regardless of the method of obtaining MAC addresses, information about them must be stored in a database.
Given the previously described features of building a network diagram, additional information must be recorded
for each MAC address:

— about the point to which the device that received the MAC address is connected; 

— about the network interface from which the MAC address was received as the sender's address [14]. 

As a result, the database table will be described by relation A with the following scheme: 

A (id, address, point, side). 
Here, id — primary key used only for identifying records in the table; address — MAC address of the device in the 
network extracted from the passing packet; point — network connection point (physical location); side — symbol of the 
network interface that transmitted the packet from which the MAC address was extracted. 
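
A possible realization of relation A is sketched below with SQLite from the Python standard library. The choice of SQLite, the integer encoding of point and side, the uniqueness constraint and the helper names `create_store` and `record` are assumptions made for illustration; the paper does not prescribe them.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open a database and create table A (id, address, point, side)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS A (
            id      INTEGER PRIMARY KEY,
            address TEXT    NOT NULL,
            point   INTEGER NOT NULL,
            side    INTEGER NOT NULL,
            UNIQUE (address, point, side)  -- assumption: one row per sighting
        )
        """
    )
    return conn


def record(conn, address, point, side):
    """Store one observation; repeated sightings of the same triple are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO A (address, point, side) VALUES (?, ?, ?)",
        (address, point, side),
    )


conn = create_store()
record(conn, "A4-57-F3", 1, 1)
record(conn, "A4-57-F3", 1, 1)   # duplicate sighting, not stored twice
record(conn, "A4-57-F3", 2, 1)
print(conn.execute("SELECT address, point, side FROM A ORDER BY id").fetchall())
```

The uniqueness constraint keeps the table small when the same device sends many packets through the same point.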

After the MAC address database has been formed for a certain number of traffic capture points, the next stage
begins: building the network diagram. It is based on the distribution of MAC addresses obtained at
different connection points. Let us denote two arbitrary points as p1 and p2. For each point, two sets of addresses
are obtained, one from each network adapter. Let X and Y denote the sets of addresses for point p1, and Z
and V the sets of addresses for point p2 (Fig. 1).

Connection at point p1:
X = {A4-57-F3..., A1-D0-4B..., AE-3A-58...}
Y = {B0-0D-3F..., B7-DE-09..., C8-24-6F..., C6-2A-30..., C2-C7-88...}

Connection at point p2:
Z = {A4-57-F3..., A1-D0-4B..., AE-3A-58..., B0-0D-3F..., B7-DE-09...}
V = {C8-24-6F..., C6-2A-30..., C2-C7-88...}

Fig. 1. Distribution of addresses across sets when connecting to different points of the network


From here on, only the first three bytes of each MAC address are given to shorten the notation. Within the
example under consideration they are unique, which is enough to illustrate how the method works.

Based on the obtained distribution of addresses across sets corresponding to different network interfaces, it is
possible to draw initial conclusions about the mutual location of devices. To do this, you need to calculate all possible
intersections for two points, that is, $X \cap Z$, $X \cap V$, $Y \cap Z$, $Y \cap V$.

It is advisable to calculate the intersections by means of a database management system, because information on which
set each address belongs to is already stored in the database and operations on sets are supported in relational algebra [15].
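The queries to be executed are equivalent to the following set of expressions, where X and Y correspond to sides 1 and 2
of point 1, and Z and V to sides 1 and 2 of point 2. The projection onto the address attribute is applied before
intersecting, since the selected tuples differ in their point and side attributes:
\[
\begin{aligned}
X \cap Z &= \pi_{address}(\sigma_{point = 1 \wedge side = 1}(A)) \cap \pi_{address}(\sigma_{point = 2 \wedge side = 1}(A)),\\
X \cap V &= \pi_{address}(\sigma_{point = 1 \wedge side = 1}(A)) \cap \pi_{address}(\sigma_{point = 2 \wedge side = 2}(A)),\\
Y \cap Z &= \pi_{address}(\sigma_{point = 1 \wedge side = 2}(A)) \cap \pi_{address}(\sigma_{point = 2 \wedge side = 1}(A)),\\
Y \cap V &= \pi_{address}(\sigma_{point = 1 \wedge side = 2}(A)) \cap \pi_{address}(\sigma_{point = 2 \wedge side = 2}(A)).
\end{aligned}
\]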
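
These relational expressions map directly onto the SQL `INTERSECT` operator. The sketch below is one way to run them, assuming SQLite and an integer encoding of point and side (X = point 1, side 1; Y = point 1, side 2; Z = point 2, side 1; V = point 2, side 2). The helper name `intersection` is hypothetical, and the table is filled with the address sets of Fig. 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE A (id INTEGER PRIMARY KEY, address TEXT, point INTEGER, side INTEGER)"
)

# Address sets of Fig. 1, keyed by (point, side).
observed = {
    (1, 1): ["A4-57-F3", "A1-D0-4B", "AE-3A-58"],                          # X
    (1, 2): ["B0-0D-3F", "B7-DE-09", "C8-24-6F", "C6-2A-30", "C2-C7-88"],  # Y
    (2, 1): ["A4-57-F3", "A1-D0-4B", "AE-3A-58", "B0-0D-3F", "B7-DE-09"],  # Z
    (2, 2): ["C8-24-6F", "C6-2A-30", "C2-C7-88"],                          # V
}
for (point, side), addresses in observed.items():
    conn.executemany(
        "INSERT INTO A (address, point, side) VALUES (?, ?, ?)",
        [(a, point, side) for a in addresses],
    )

# Projection onto address, selection by point and side, then intersection.
INTERSECT_SQL = """
    SELECT address FROM A WHERE point = ? AND side = ?
    INTERSECT
    SELECT address FROM A WHERE point = ? AND side = ?
"""


def intersection(conn, point_a, side_a, point_b, side_b):
    """Return the set of addresses seen on both given (point, side) pairs."""
    rows = conn.execute(INTERSECT_SQL, (point_a, side_a, point_b, side_b))
    return {address for (address,) in rows}


print(sorted(intersection(conn, 1, 2, 2, 1)))   # Y ∩ Z
```

Running the same function for the other three pairs of sides yields the remaining intersections.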
Research Results. Let us consider an example of the application of the proposed methodology for constructing 
a network topology based on the distribution of sets of MAC addresses shown in Fig. 1. Determine the required 
intersections of the sets: 
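\[
\begin{aligned}
X \cap Z &= \{\text{A4-57-F3},\ \text{A1-D0-4B},\ \text{AE-3A-58}\},\\
X \cap V &= \varnothing,\\
Y \cap Z &= \{\text{B0-0D-3F},\ \text{B7-DE-09}\},\\
Y \cap V &= \{\text{C8-24-6F},\ \text{C6-2A-30},\ \text{C2-C7-88}\}.
\end{aligned}
\]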
Note that one of the intersections, $X \cap V$, is an empty set. This result is obtained for sides directed away from
each other. Accordingly, the other two sets (Y and Z) represent the sides directed at each other, and
their intersection contains the addresses located between the measurement points, that is, between p1 and p2.

The remaining intersections contain the addresses located on the outer sides of the measurement points. X and
V represent oppositely directed sides. Therefore, the remaining intersection in which X participates (that is, $X \cap Z$)
includes the addresses located on the side of point p1, and $Y \cap V$ those on the side of point p2. Thus, it is possible
to draw an initial conclusion about the relative location of all the addresses obtained in the analysis, as well as about
their location relative to the measurement points (Fig. 2).

[Figure: the addresses A4-57-F3..., A1-D0-4B..., AE-3A-58... lie beyond point p1; B0-0D-3F..., B7-DE-09... lie between points p1 and p2; C8-24-6F..., C6-2A-30..., C2-C7-88... lie beyond point p2.]

Fig. 2. Mutual arrangement of devices and points
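
The reasoning that leads to this arrangement can be expressed with ordinary set operations. The following sketch is illustrative only: the function name `arrange` is hypothetical, and it assumes that every adapter saw at least one address. It looks for the pair of sides with an empty intersection, treats them as directed away from each other, and derives the three groups from the remaining intersections.

```python
def arrange(p1_sides, p2_sides):
    """Split the addresses seen at two points into three ordered groups.

    p1_sides and p2_sides are pairs of address sets, one set per adapter.
    Returns (beyond_p1, between, beyond_p2).
    """
    for i in (0, 1):
        for j in (0, 1):
            outward_1, inward_1 = p1_sides[i], p1_sides[1 - i]
            outward_2, inward_2 = p2_sides[j], p2_sides[1 - j]
            # Sides directed away from each other share no addresses.
            if not (outward_1 & outward_2):
                return (
                    outward_1 & inward_2,   # devices beyond p1
                    inward_1 & inward_2,    # devices between p1 and p2
                    inward_1 & outward_2,   # devices beyond p2
                )
    raise ValueError("no pair of disjoint sides found")


X = {"A4-57-F3", "A1-D0-4B", "AE-3A-58"}
Y = {"B0-0D-3F", "B7-DE-09", "C8-24-6F", "C6-2A-30", "C2-C7-88"}
Z = {"A4-57-F3", "A1-D0-4B", "AE-3A-58", "B0-0D-3F", "B7-DE-09"}
V = {"C8-24-6F", "C6-2A-30", "C2-C7-88"}

beyond_p1, between, beyond_p2 = arrange((X, Y), (Z, V))
print(sorted(between))   # ['B0-0D-3F', 'B7-DE-09']
```

If no device lies between the two points, more than one pair of sides is disjoint; the sketch then simply takes the first such pair, and both resulting orientations are consistent with the measurements.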


It should be remembered that the points on this diagram are not network nodes (in particular, they are not switches).
However, the results obtained allow us to make the following assumption: if a set includes several addresses,
there is at least one switch inside it. The justification is that several computers cannot be
connected to each other directly; this requires appropriate network equipment (Fig. 3).

[Figure: network scheme in which the address groups {A4-57-F3..., A1-D0-4B..., AE-3A-58...}, {B0-0D-3F..., B7-DE-09...} and {C8-24-6F..., C6-2A-30..., C2-C7-88...} are each connected through at least one switch.]

Fig. 3. Network scheme


The scheme shown in Fig. 3 is not final, since there may be not one but several switches inside each of the sets.
At the next stages of the method, similar operations should be performed for each of the obtained sets, collecting MAC
addresses at other points in the network. Each new measurement refines the scheme and supplements it
with new switching nodes.
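
One way to picture this sequential refinement is to group addresses by the side on which each measurement point observed them, so that every additional point can only split existing groups further. The sketch below is an assumption about how such bookkeeping could be organized, not the procedure used in this work. The name `refine` is hypothetical, and each address is assumed to appear on exactly one side of every point.

```python
from collections import defaultdict


def refine(observations):
    """Group addresses by the sides on which the measurement points saw them.

    observations: iterable of (address, point, side) tuples.
    Returns a dict mapping a signature ((point, side), ...) to a set of addresses.
    """
    seen = defaultdict(dict)
    for address, point, side in observations:
        seen[address][point] = side
    groups = defaultdict(set)
    for address, sides in seen.items():
        signature = tuple(sorted(sides.items()))
        groups[signature].add(address)
    return dict(groups)


group_a = ("A4-57-F3", "A1-D0-4B", "AE-3A-58")
group_b = ("B0-0D-3F", "B7-DE-09")
group_c = ("C8-24-6F", "C6-2A-30", "C2-C7-88")

# Observations of Fig. 1: point 1 sees A on side 1, B and C on side 2;
# point 2 sees A and B on side 1, C on side 2.
point_1 = [(a, 1, 1) for a in group_a] + [(a, 1, 2) for a in group_b + group_c]
point_2 = [(a, 2, 1) for a in group_a + group_b] + [(a, 2, 2) for a in group_c]

print(len(refine(point_1)))             # groups after the first measurement
print(len(refine(point_1 + point_2)))   # groups after the second measurement
```

With the data of Fig. 1, the first measurement separates the addresses into two groups, and adding the second measurement splits them into the three groups shown in Fig. 2.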

Discussion and Conclusions. A method for efficiently constructing a network scheme is proposed. The
approach is based on an automated analysis of open information extracted from packets transmitted over the network.
This technique is an alternative to physically tracing communication lines and determining the devices
connected by them. The proposed solutions can significantly reduce the time spent by system
administrators on determining the location of all devices and drawing them on the network diagram. An advantage of
the method is that the topology of network connections can be refined sequentially until the required
accuracy is obtained.